Mesorhizobium ciceri bv. biserrulae strain WSM1271T was isolated from root nodules of the pasture
legume Biserrula pelecinus growing in the Mediterranean basin. Previous studies have shown that this
aerobic, motile, Gram-negative, non-spore-forming rod preferentially nodulates B. pelecinus, a legume
with many beneficial agronomic attributes for sustainable agriculture in Australia. We describe the
genome of Mesorhizobium ciceri bv. biserrulae strain WSM1271T, consisting of a 6,264,489 bp
chromosome and a 425,539 bp plasmid that together encode 6,470 protein-coding genes and 61
RNA-only encoding genes.



Introduction 

The productivity of sustainable agriculture around the world is heavily dependent on the
provision of bioavailable nitrogen (N) [1]. The demand for N by non-leguminous and leguminous
plants can be supplied by the application of chemically synthesized nitrogenous fertilizer onto
crops and pastures. However, the production of fertilizer is costly and requires the burning of
fossil fuels in the manufacturing process, which increases greenhouse gas emissions. Furthermore,
high application rates of fertilizer can contaminate ecosystems and waterways, and result in
leaching into the environment.

In contrast, the demand for N by leguminous plants can be sustainably met through the
biological process of N fixation that occurs following the successful formation of an effective
symbiosis. This symbiotic nitrogen fixation (SNF) process can account for approximately 70% of
the bioavailable nitrogen supplied to legumes [1].

One legume that has many beneficial agronomic attributes is Biserrula pelecinus L., an annual
herbaceous legume native to the Mediterranean basin that was introduced into Australian soil in
1994 [2]. The beneficial agronomic attributes of this legume include drought tolerance, hard seed
production, easy harvesting characteristics, insect tolerance and, most importantly, a capacity
to grow well in the acidic duplex soils of Australia [2,3]. This monospecific legume specifically
forms an effective nitrogen-fixing symbiosis with the root nodule bacterium Mesorhizobium ciceri
bv. biserrulae type strain WSM1271T (= LMG23838 = HAMBI2942) [4,5]. Australian indigenous
rhizobial populations were found to be incapable of nodulating B. pelecinus L. [2]. However,
within six years of the introduction of the inoculant into Australia, the in situ evolution of a
diverse range of competitive strains capable of nodulating B. pelecinus L. compromised optimal
N2-fixation with this host. This rapid emergence of less effective strains threatens the
establishment of this legume species in the Australian agricultural setting. The sub-optimal
strains appear to have evolved from indigenous mesorhizobia that acquired the island of genes
associated with symbiosis from the original inoculant, WSM1271T, following a horizontal gene
transfer event [6].
In this report, a summary classification and a set of general features for M. ciceri bv.
biserrulae strain WSM1271T are presented together with the description of the complete genome
sequence and its annotation.



Classification and features 

M. ciceri strain WSM1271T is a motile, Gram-negative, non-spore-forming rod (Figure 1 and
Figure 2) in the order Rhizobiales of the class Alphaproteobacteria. It is moderately fast
growing, forming 2-4 mm diameter colonies within 3-4 days, and has a mean generation time of
4-6 h when grown in half Lupin Agar (½LA) broth [7] at 28 °C. Colonies on ½LA are white-opaque,
slightly domed and moderately mucoid with smooth margins (Figure 3).

The organism tolerates a pH range between 5.5 
and 9.0. Carbon source utilization and fatty acid 
profiles have been described before [6]. Minimum 
Information about the Genome Sequence (MIGS) is 
provided in Table 1. 

Figure 4 shows the phylogenetic neighborhood of M. ciceri bv. biserrulae strain WSM1271T in a
16S rRNA sequence based tree. This strain clustered in a tight group, which included
M. australicum, M. ciceri, M. loti and M. shangrilense, and had >99% sequence identity with all
four type strains. Our polyphasic taxonomic study indicates that WSM1271T is a new biovar of
nodulating bacteria [5].
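
The ">99% sequence identity" figure is a pairwise comparison over aligned 16S rRNA positions. As a minimal sketch of how such a value can be computed, assuming two sequences that are already aligned to equal length and that gap-containing columns are skipped (the function name and the toy sequences are hypothetical and are not data from this study):

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap-containing columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real 16S rRNA data).
toy_a = "ACGTACGTAC"
toy_b = "ACGTACGTAT"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

On a 1,290 bp internal region such as the one used for Figure 4, an identity above 99% corresponds to at most 12 mismatched positions.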



Symbiotaxonomy 

M. ciceri bv. biserrulae strain WSM1271T has an extremely narrow legume host range for
symbiosis, forming highly effective nitrogen-fixing root nodules only on Biserrula pelecinus L.
This strain also nodulates the closely related species Astragalus membranaceus but does not
nodulate 21 other legume species nodulated by Mesorhizobium spp. [5]. The high degree of
specificity in the symbiotic relationships of this strain is representative of root nodule
bacteria isolated from B. pelecinus L. growing in undisturbed landscapes in the Mediterranean
basin, and is an important example of a highly specific legume host-root nodule bacteria
relationship in an annual herbaceous legume used as a forage species in agriculture.




Figure 1. Image of Mesorhizobium ciceri bv. biserrulae strain WSM1271T using scanning
electron microscopy.

Figure 2. Image of Mesorhizobium ciceri bv. biserrulae strain WSM1271T using transmission
electron microscopy.

Figure 3. Image of Mesorhizobium ciceri bv. biserrulae strain WSM1271T showing colony
morphology on solid media.
Table 1. Classification and features of Mesorhizobium ciceri bv. biserrulae strain WSM1271T according to the
MIGS recommendations [8,9].

MIGS ID    Property                 Term                                                      Evidence code
           Current classification   Domain Bacteria                                           TAS [9]
                                    Phylum Proteobacteria                                     TAS [10]
                                    Class Alphaproteobacteria                                 TAS [11,12]
                                    Order Rhizobiales                                         TAS [11,13]
                                    Family Phyllobacteriaceae                                 TAS [11,14]
                                    Genus Mesorhizobium                                       TAS [15]
                                    Species Mesorhizobium ciceri bv. biserrulae               TAS [15]
           Gram stain               Negative                                                  TAS [6]
           Cell shape               Rod                                                       TAS [6]
           Motility                 Motile                                                    TAS [6]
           Sporulation              Non-sporulating                                           TAS [16]
           Temperature range        Mesophile                                                 TAS [16]
           Optimum temperature      28 °C                                                     TAS [6]
           Salinity                 Unknown                                                   NAS
MIGS-22    Oxygen requirement       Aerobic                                                   TAS [16]
           Carbon source            Arabinose, gentiobiose, glucose, mannitol & melibiose     TAS [6]
           Energy source            Chemoorganotroph                                          TAS [16]
MIGS-6     Habitat                  Soil, root nodule, host                                   TAS [6]
MIGS-15    Biotic relationship      Free living, symbiotic                                    TAS [6]
MIGS-14    Pathogenicity            None                                                      NAS
           Biosafety level          1                                                         TAS [17]
           Isolation                Root nodule                                               TAS [5,6]
MIGS-4     Geographic location      5 km before Bottida, Sardinia                             TAS [2,5]
MIGS-5     Nodule collection date   April 1993                                                TAS [4]
MIGS-4.1   Longitude                9.012008                                                  NAS
MIGS-4.2   Latitude                 40.382709                                                 NAS
MIGS-4.3   Depth                    10 cm                                                     NAS
MIGS-4.4   Altitude                 295 m                                                     TAS [5]



Evidence codes - TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS:
Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a
generally accepted property for the species, or anecdotal evidence). Evidence codes are from the Gene
Ontology project [18].
[Figure 4 tree image not reproduced; branching order and bootstrap values are not recoverable.
Tip labels recoverable from the figure:]
Mesorhizobium loti MAFF303099 (Gc00040)*
Mesorhizobium opportunistum WSM2075T (NR_074209, Gc01853)
Mesorhizobium loti NZP2037 (Gi08826)
Mesorhizobium huakuii (D13431)
Mesorhizobium plurifarium LMG 11892T (Y14158)
Mesorhizobium australicum WSM2073T (CP003358, Gc02468)
Mesorhizobium septentrionale CCBAU 11014T (AF508207)
Mesorhizobium ciceri NBRC 100389T (AB681164)
Mesorhizobium loti LMG 6125T (X67229, Gi08881)
Mesorhizobium shangrilense CCBAU 65327T (EU074203)
Mesorhizobium loti R88B (Gi08827)
Mesorhizobium albiziae CCBAU 61158T (DQ100066)
Ensifer meliloti LMG 6133T (X67222)
Rhizobium mongolense USDA 1844T (Gi08900, U89817)
Rhizobium leguminosarum bv. viciae USDA 2370T (U29386, Gi06483)
Azorhizobium caulinodans ORS 571T (D11342)
Bradyrhizobium liaoningense LMG 18230T (AJ250813)
Bradyrhizobium yuanmingense LMG 21827T (AF193818)
Bradyrhizobium japonicum USDA 6T (Gc02045, U69638)
Bradyrhizobium canariense LMG 22265T (AJ558025)
Bradyrhizobium elkanii USDA 76T (AB509378, Gi08850)
Methylobacterium nodulans ORS 2060T (AF220763, Gc00935)*
Figure 4. Phylogenetic tree showing the relationships of Mesorhizobium ciceri bv. biserrulae
WSM1271T (shown in bold print) with root nodule bacteria in the order Rhizobiales based on
aligned sequences of the 16S rRNA gene (1,290 bp internal region). All sites were informative and
there were no gap-containing sites. Phylogenetic analyses were performed using MEGA [19]. The
tree was built using the Maximum-Likelihood method with the General Time Reversible model.
Bootstrap analysis [20] was performed with 500 replicates to assess the support of the clusters.
Type strains are indicated with a superscript T. Brackets after the strain name contain a DNA
database accession number and/or a GOLD ID (beginning with the prefix G) for a sequencing
project registered in GOLD [21]. Published genomes are indicated with an asterisk.
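
The bootstrap analysis mentioned in the caption rests on resampling alignment columns with replacement, rebuilding a tree from each resampled alignment, and reporting how often each cluster reappears. The resampling step alone can be sketched as follows. This is an illustration of the general technique, not the MEGA implementation; the function name, the fixed seed and the toy alignment are hypothetical.

```python
import random


def bootstrap_alignment(alignment: dict, rng: random.Random) -> dict:
    """Return one bootstrap replicate: columns resampled with replacement."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # Draw n_cols column indices with replacement; the same indices are
    # applied to every taxon so that columns stay intact.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


# Toy alignment (hypothetical taxa and sequences).
toy_alignment = {
    "taxon_A": "ACGTAC",
    "taxon_B": "ACGTTC",
    "taxon_C": "ACCTAC",
}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(500)]
print(len(replicates), "replicates of length", len(replicates[0]["taxon_A"]))
```

Each replicate would then be passed to the tree-building method, and the support for a cluster is the percentage of the 500 replicate trees that contain it.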
Genome sequencing and annotation

Genome project history

The Joint Genome Institute (JGI), operated by the US Department of Energy (DOE), sequenced,
finished and annotated WSM1271T as part of the Community Sequencing Program (CSP). The genome
project is deposited in the Genomes OnLine Database [21]. The finished genome sequence is in
GenBank. The CSP selects projects on the basis of environmental and agricultural relevance to
issues in global carbon cycling, alternative energy production, and biogeochemical importance.
Table 2 summarizes the project information.

Growth conditions and DNA isolation 

M. ciceri bv. biserrulae strain WSM1271T was grown to mid logarithmic phase in TY rich medium
[22] on a gyratory shaker at 28 °C. DNA was isolated from 60 mL of cells using a CTAB (cetyl
trimethyl ammonium bromide) bacterial genomic DNA isolation method [23].



Table 2. Genome sequencing project information for Mesorhizobium ciceri bv. biserrulae strain WSM1271T.

MIGS ID      Property                   Term
MIGS-31      Finishing quality          Finished
MIGS-28      Libraries used             Illumina GAii shotgun library, 454 Titanium standard library
                                        and paired end 454 libraries
MIGS-29      Sequencing platforms       Illumina and 454 technologies
MIGS-31.2    Sequencing coverage        454 (26.8x) and Illumina (124x)
MIGS-30      Assemblers                 Newbler, version 2.3 and Velvet version 0.7.63, PHRAP and CONSED
MIGS-32      Gene calling method        Prodigal, GenePrimp
             GenBank ID                 CP002447, CP002448
             GenBank date of release    November 10, 2012
             GOLD ID                    Gc01578
             NCBI project ID            48991
             Database: IMG              649633066
             Project relevance          Symbiotic nitrogen fixation, agriculture
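
Sequencing coverage in Table 2 is total sequenced bases divided by genome length. As a quick arithmetic check of the Illumina figure, the sketch below uses values stated in this report: 832.1 Mb of Illumina draft data, a 6,690,028 bp genome (Table 3), and 23,461,369 raw Illumina reads totaling 844.6 Mb. The constant and function names are illustrative only.

```python
GENOME_SIZE_BP = 6_690_028      # genome size from Table 3
ILLUMINA_DRAFT_BP = 832.1e6     # Illumina draft data used in the final assembly
ILLUMINA_RAW_BP = 844.6e6       # raw Illumina yield
ILLUMINA_RAW_READS = 23_461_369  # raw Illumina read count


def coverage(total_bases: float, genome_size: int) -> float:
    """Average fold coverage: sequenced bases per genome base."""
    return total_bases / genome_size


illumina_cov = coverage(ILLUMINA_DRAFT_BP, GENOME_SIZE_BP)
mean_read_len = ILLUMINA_RAW_BP / ILLUMINA_RAW_READS

print(f"Illumina coverage: {illumina_cov:.1f}x")
print(f"Mean Illumina read length: {mean_read_len:.1f} bp")
```

The result rounds to the 124x reported in the table, and the implied mean read length is about 36 bp.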



Genome sequencing and assembly 

The Joint Genome Institute (JGI) generated the draft genome of M. ciceri bv. biserrulae
WSM1271T using a combination of Illumina [24] and 454 technologies [25]. An Illumina GAii
shotgun library generated 23,461,369 reads totaling 844.6 Mb; a 454 Titanium standard library
generated 277,881 reads; and two paired end 454 libraries with average insert sizes of
1.137 +/- 2.842 kb and 4.378 +/- 1.094 kb generated 40,653 and 130,843 reads, respectively, for a
total of 244.0 Mb of 454 data. All general aspects of library construction and sequencing
performed at the JGI can be found at the JGI website [23]. The initial draft assembly contained
32 contigs in 2 scaffolds. The 454 Titanium standard data and the 454 paired end data were
assembled together with Newbler, version 2.3. The Newbler consensus sequences were
computationally shredded into 2 kb overlapping fake reads (shreds). Illumina sequencing data
were assembled with VELVET, version 0.7.63 [26], and the consensus sequences were
computationally shredded into 1.5 kb overlapping fake reads (shreds). We integrated the 454
Newbler consensus shreds, the Illumina VELVET consensus shreds and the read pairs in the 454
paired end library using parallel phrap, version SPS - 4.24 (High Performance Software, LLC).
The software Consed [27-29] was used in the subsequent finishing process. Illumina data were
used to correct potential base errors and increase consensus quality using the software
Polisher developed at JGI (Alla Lapidus, unpublished). Possible mis-assemblies were corrected
using gapResolution (Cliff Han, unpublished),
Dupfinisher [30], or sequencing cloned bridging PCR fragments with subcloning. Gaps between
contigs were closed by editing in Consed, by PCR and by Bubble PCR (J-F Cheng, unpublished)
primer walks. A total of 49 additional reactions were necessary to close gaps and to raise the
quality of
the finished sequence. The total size of the genome is 6,690,028 bp and the final assembly is
based on 112.0 Mb of 454 draft data, which provides an average 26.8x coverage of the genome,
and 832.1 Mb of Illumina draft data, which provides an average 124x coverage of the genome.
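
The "shredding" step described above turns a long consensus sequence into overlapping pseudo-reads so that assemblies from different platforms can be merged by a read-based assembler. A minimal sketch of the idea follows. The report gives only the shred lengths (2 kb and 1.5 kb), so the step size, which sets the overlap, is an assumption, as are the function name and the toy contig.

```python
def shred(consensus: str, shred_len: int, step: int) -> list:
    """Cut a consensus sequence into overlapping fixed-length pseudo-reads."""
    if not 0 < step < shred_len:
        raise ValueError("step must be positive and smaller than shred_len")
    if len(consensus) <= shred_len:
        return [consensus]
    starts = list(range(0, len(consensus) - shred_len + 1, step))
    # Add a final shred flush with the end so the tail is not lost.
    if starts[-1] + shred_len < len(consensus):
        starts.append(len(consensus) - shred_len)
    return [consensus[s:s + shred_len] for s in starts]


# Hypothetical 5,000 bp contig; a 1,000 bp step gives 1,000 bp overlaps (assumed).
toy_contig = "ACGT" * 1250
shreds = shred(toy_contig, shred_len=2000, step=1000)
print(len(shreds), "shreds of length", len(shreds[0]))
```

Because consecutive shreds overlap, an overlap-based assembler such as phrap can stitch them back together while also incorporating the paired end reads.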

Genome annotation 

Genes were identified using Prodigal [31] as part of the Oak Ridge National Laboratory genome
annotation pipeline, followed by a round of manual curation using the JGI GenePrimp pipeline
[32]. The predicted CDSs were translated and used to search the National Center for
Biotechnology Information (NCBI) non-redundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG,
COG, and InterPro databases. These data sources were combined to assert a product description
for each predicted protein. Non-coding genes and miscellaneous features were predicted using
tRNAscan-SE [33], RNAmmer [34], Rfam [35], TMHMM [36], and SignalP [37]. Additional gene
prediction analyses and functional annotation were performed within the Integrated Microbial
Genomes (IMG-ER) platform [38].

Genome properties 

The genome is 6,690,028 bp long with a 62.56% GC content (Table 3) and comprises a single
chromosome and a single plasmid. From a total of 6,531 genes, 6,470 were protein encoding and
61 were RNA-only encoding genes. Within the genome, 206 pseudogenes were also identified. The
majority of genes (70.74%) were assigned a putative function while the remaining genes were
annotated as hypothetical. The distribution of genes into COG functional categories is
presented in Table 4 and Figures 5, 6 and 7.



Table 3. Genome statistics for Mesorhizobium ciceri bv. biserrulae strain WSM1271T.

Attribute                           Value        % of Total
Genome size (bp)                    6,690,028    100.00
DNA coding region (bp)              5,791,860    86.57
DNA G+C content (bp)                4,185,397    62.56
Number of replicons                 2
Extrachromosomal elements           1
Total genes                         6,531        100.00
RNA genes                           61           0.93
Protein-coding genes                6,470        99.07
Genes with function prediction      4,620        70.74
Genes assigned to COGs              5,174        79.22
Genes assigned Pfam domains         5,398        82.65
Genes with signal peptides          597          9.14
Genes with transmembrane helices    1,528        23.40
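
The percentages in Table 3 follow directly from the counts: base-pair attributes are relative to the genome size and gene attributes are relative to the total gene count. The short check below recomputes them and also confirms that the chromosome and plasmid sizes given in the abstract add up to the genome size. Variable names are illustrative.

```python
GENOME_SIZE_BP = 6_690_028
CHROMOSOME_BP = 6_264_489   # from the abstract
PLASMID_BP = 425_539        # from the abstract
TOTAL_GENES = 6_531


def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)


# The two replicons account for the whole genome.
replicon_sum = CHROMOSOME_BP + PLASMID_BP

bp_rows = {"DNA coding region": 5_791_860, "DNA G+C content": 4_185_397}
gene_rows = {
    "RNA genes": 61,
    "Protein-coding genes": 6_470,
    "Genes with function prediction": 4_620,
    "Genes assigned to COGs": 5_174,
    "Genes assigned Pfam domains": 5_398,
    "Genes with signal peptides": 597,
    "Genes with transmembrane helices": 1_528,
}

print("replicon sum matches genome size:", replicon_sum == GENOME_SIZE_BP)
for name, value in bp_rows.items():
    print(f"{name}: {pct(value, GENOME_SIZE_BP):.2f}%")
for name, value in gene_rows.items():
    print(f"{name}: {pct(value, TOTAL_GENES):.2f}%")
```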
Figure 5. Graphical circular map of the chromosome. From outside to the center: Genes on forward strand
(colored by COG categories as denoted by the IMG platform), Genes on reverse strand (colored by COG
categories), RNA genes (tRNAs green, sRNAs red, other RNAs black), GC content, GC skew.

Figure 6. Graphical circular map of the plasmid of Mesorhizobium ciceri bv. biserrulae WSM1271T. From
outside to the center: Genes on forward strand (colored by COG categories as denoted by the IMG
platform), Genes on reverse strand (colored by COG categories), RNA genes (tRNAs green, sRNAs red,
other RNAs black), GC content, GC skew.

Figure 7. Color code for Figures 5 and 6.
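
The two innermost rings of the circular maps plot GC content and GC skew along the replicon. Both are simple windowed statistics: GC content is (G + C) / window length, and GC skew is (G - C) / (G + C). The sketch below computes them for non-overlapping or sliding windows. The window and step sizes used by the IMG platform are not stated in this report, so the values here, like the function name and toy sequence, are assumptions for illustration.

```python
def gc_stats(seq: str, window: int, step: int):
    """Yield (start, gc_fraction, gc_skew) for each window along the sequence."""
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        gc_fraction = (g + c) / window
        # Skew is undefined when a window has no G or C; report 0.0 there.
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        yield start, gc_fraction, gc_skew


# Toy sequence (hypothetical): a G-rich window followed by a C-rich window.
toy_seq = "GGGGCCAT" + "GCCCCCAT"
for start, gc, skew in gc_stats(toy_seq, window=8, step=8):
    print(f"start={start:2d}  GC={gc:.2f}  skew={skew:+.3f}")
```

In bacterial replicons the sign of the GC skew commonly switches near the origin and terminus of replication, which is why it is plotted alongside the gene tracks.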
Table 4. Number of protein coding genes of Mesorhizobium ciceri bv. biserrulae
WSM1271T associated with the general COG functional categories.

Code   Value   %       COG category
J      193     3.35    Translation, ribosomal structure and biogenesis
A      1       0.02    RNA processing and modification
K      492     8.53    Transcription
L      156     2.71    Replication, recombination and repair
B      6       0.10    Chromatin structure and dynamics
D      35      0.61    Cell cycle control, mitosis and meiosis
Y      0       0.00    Nuclear structure
V      63      1.09    Defense mechanisms
T      238     4.13    Signal transduction mechanisms
M      290     5.03    Cell wall/membrane biogenesis
N      62      1.08    Cell motility
Z      0       0.00    Cytoskeleton
W      2       0.03    Extracellular structures
U      124     2.15    Intracellular trafficking and secretion
O      185     3.21    Posttranslational modification, protein turnover, chaperones
C      356     6.17    Energy production and conversion
G      535     9.28    Carbohydrate transport and metabolism
E      732     12.70   Amino acid transport and metabolism
F      92      1.60    Nucleotide transport and metabolism
H      204     3.54    Coenzyme transport and metabolism
I      235     4.08    Lipid transport and metabolism
P      274     4.75    Inorganic ion transport and metabolism
Q      175     3.04    Secondary metabolite biosynthesis, transport and catabolism
R      731     12.68   General function prediction only
S      585     10.15   Function unknown
-      1,357   20.78   Not in COGs
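
The percentage column of Table 4 can be reproduced from the counts, with one subtlety: the per-category values appear to be relative to the sum of all COG category assignments (5,766), whereas the "Not in COGs" row is relative to the total gene count (6,531). The sketch below recomputes both; the category code I (lipid transport and metabolism) is used where the extracted table shows the digit 1, and variable names are illustrative.

```python
cog_counts = {
    "J": 193, "A": 1, "K": 492, "L": 156, "B": 6, "D": 35, "Y": 0,
    "V": 63, "T": 238, "M": 290, "N": 62, "Z": 0, "W": 2, "U": 124,
    "O": 185, "C": 356, "G": 535, "E": 732, "F": 92, "H": 204,
    "I": 235, "P": 274, "Q": 175, "R": 731, "S": 585,
}
NOT_IN_COGS = 1_357
TOTAL_GENES = 6_531  # from Table 3

# Denominator for the per-category percentages: all COG assignments.
total_assignments = sum(cog_counts.values())
cog_pct = {code: round(100 * n / total_assignments, 2) for code, n in cog_counts.items()}
not_in_cogs_pct = round(100 * NOT_IN_COGS / TOTAL_GENES, 2)

print("total COG assignments:", total_assignments)
for code, value in cog_pct.items():
    print(f"{code}: {value:.2f}%")
print(f"Not in COGs: {not_in_cogs_pct:.2f}% of {TOTAL_GENES} genes")
```

The assignment total exceeds the 5,174 genes assigned to COGs in Table 3, which is consistent with some genes being assigned to more than one category.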